IEICE global.ieice.org Site

Keyword Search Result

[Keyword] low power(376hit)

121-140hit(376hit)

Low-Power Embedded Processor Design Using Branch Direction
Gi-Ho PARK Jung-Wook PARK Gunok JUNG Shin-Dug KIM

LETTER-High-Level Synthesis and System-Level Design

Vol:
E92-A No:12
Page(s):
3180-3181
This paper presents a wordline gating logic for reducing unnecessary BTB accesses. Partial bit of the branch predictor was simultaneously recorded in the middle of BTB to prevent further SRAM operation. Experimental results with embedded applications showed that the proposed mechanism reduces around 38% of BTB power consumption.
An Approach for Reducing Leakage Current Variation due to Manufacturing Variability
Tsuyoshi SAKATA Takaaki OKUMURA Atsushi KUROKAWA Hidenari NAKASHIMA Hiroo MASUDA Takashi SATO Masanori HASHIMOTO Koutaro HACHIYA Katsuhiro FURUKAWA Masakazu TANAKA Hiroshi TAKAFUJI Toshiki KANAMOTO

PAPER-Device and Circuit Modeling and Analysis

Vol:
E92-A No:12
Page(s):
3016-3023
Leakage current is an important qualitative metric of LSI (Large Scale Integrated circuit). In this paper, we focus on reduction of leakage current variation under the process variation. Firstly, we derive a set of quadratic equations to evaluate delay and leakage current under the process variation. Using these equations, we discuss the cases of varying leakage current without degrading delay distribution and propose a procedure to reduce the leakage current variations. From the experiments, we show the proposed method effectively reduces the leakage current variation up to 50% at 90 percentile point of the distribution compared with the conventional design approach.
Design of Voltage-Mode MAX-MIN Circuits with Low Area and Low Power Consumption
Mohammad SOLEIMANI Abdollah KHOEI Khayrollah HADIDI Vahid Fagih DINAVARI

PAPER-Device and Circuit Modeling and Analysis

Vol:
E92-A No:12
Page(s):
3044-3051
In this paper, new structure of Voltage-Mode MAX-MIN circuit are presented for nonlinear systems, fuzzy applications, neural network and etc. A differential pair with improved cascode current mirror is used to choose the desired input. The advantages of the proposed structure are high operating frequency, high precision, low power consumption, low area and simple expansion for multiple inputs by adding only three transistors for each extra input. The proposed circuit which is simulated by HSPICE in 0.35 µm CMOS process shows the total power consumption of 85 µW in 5 MHz operating frequency from a single 3.3-V supply. Also, the total area of the proposed circuit is about 420 µm2 for two input voltages, and would be negligibly increased for each extra input.
Ultra Low Power Delay Element with Post-Chip Adjustable Ability
Jung-Lin YANG Chih-Wei CHAO

PAPER-VLSI Design Technology and CAD

Vol:
E92-A No:12
Page(s):
3381-3389
Our paper proposes a low power delay element with many other valuable characteristics for asynchronous circuits in the bundled-data implementation. Delay elements are frequently utilized to interact with asynchronous environment for revealing the current status of the bundled-data asynchronous circuits. Thus, a notable portion of the total energy is consumed by the delay elements for this kind of designs. Moreover, constructing a specific delay on a chip is a difficult task for recent CMOS technology. An extreme low power asymmetrical delay element with post-chip adjustment feature was developed mainly for solving these issues. Our initial intention was to develop a programmable delay element for asynchronous data path components. The proposed delay element is also suitable for many other applications requiring low power constraint. In addition to the programmability, the delay element also demonstrated efficiently characteristics such as good tolerance to process and temperature variations on the delay. Our delay element is equivalent to approximately the average power of a 4-stage inverter chain. A large delay can be obtained by cascaded scheme with nearly zero handshaking overhead. All arguments were cautiously verified by the post-layout simulation setup using TSMC 0.35 µm and 0.18 µm technologies under all extreme corners.
Energy-Efficient Pre-Execution Techniques in Two-Step Physical Register Deallocation
Kazunaga HYODO Kengo IWAMOTO Hideki ANDO

PAPER-Computer Systems

Vol:
E92-D No:11
Page(s):
2186-2195
Instruction pre-execution is an effective way to prefetch data. We previously proposed an instruction pre-execution scheme, which we call two-step physical register deallocation (TSD). The TSD realizes pre-execution by exploiting the difference between the amount of instruction-level parallelism available with an unlimited number of physical registers and that available with an actual number of physical registers. Although previous TSD study has successfully improved performance, it still has an inefficient energy consumption. This is because attempts are made for instructions to be pre-executed as much as possible, independently of whether or not they can significantly contribute to load latency reduction, allowing for maximal performance improvement. This paper presents a scheme that improves the energy efficiency of the TSD by pre-executing only those instructions that have great benefit. Our evaluation results using the SPECfp2000 benchmark show that our scheme reduces the dynamic pre-executed instruction count by 76%, compared with the original scheme. This reduction saves 7% energy consumption of the execution core with 2% overhead. Performance degrades by 2%, compared with that of the original scheme, but is still 15% higher than that of the normal processor without the TSD.
Power Minimization for Dual- and Triple-Supply Digital Circuits via Integer Linear Programming
Ki-Yong AHN Chong-Min KYUNG

PAPER-VLSI Design Technology and CAD

Vol:
E92-A No:9
Page(s):
2318-2325
This paper proposes an Integer Linear Programming (ILP)-based power minimization method by partitioning into regions, first, with three different VDD's(PM3V), and, secondly, with two different VDD's(PM2V). To reduce the solving time of triple-VDD case (PM3V), we also proposed a partitioned ILP method(p-PM3V). The proposed method provides 29% power saving on the average in the case of triple-VDD compared to the case of single VDD. Power reduction of PM3V compared to Clustered Voltage Scaling (CVS) was about 18%. Compared to the unpartitioned ILP formulation(PM3V), the partitioned ILP method(p-PM3V) reduced the total solution time by 46% at the cost of additional power consumption within 1.3%.
A Low-Power High Accuracy Over Current Protection Circuit for Low Dropout Regulator
Socheat HENG Cong-Kha PHAM

PAPER-Electronic Circuits

Vol:
E92-C No:9
Page(s):
1208-1214
In this paper, a low power current protection circuit implemented in a low dropout regulator (LDO) is presented. The proposed circuit, designed in a 0.35 µm CMOS process, provides a precise limiting current as well as holding current with low dependency on both supply voltage and regulator output voltage. The experimental results showed that the proposed circuit is operable in the regulator output voltage range from VOUT=1.2 V to VOUT=3.6 V and supply voltage range from VDD=VOUT+0.5 V to VDD=5.6 V. Since the proposed circuit is composed of few simple basic circuits such as a comparator and a Schmitt Trigger, it has a low current consumption of less than ISS=0.82 µA at a load current of ILOAD=200 mA. This makes the circuit suitable for low power and low voltage LDO design.
A 0.31 pJ/Conversion-Step 12-Bit 100 MS/s 0.13 µm CMOS A/D Converter for 3G Communication Systems
Young-Ju KIM Kyung-Hoon LEE Myung-Hwan LEE Seung-Hoon LEE

PAPER-Electronic Circuits

Vol:
E92-C No:9
Page(s):
1194-1200
This work describes a 12-bit 100 MS/s 0.13 µm CMOS ADC for 3G wireless communication systems such as two-carrier W-CDMA applications. The proposed ADC employs a four-step pipeline architecture to optimize power consumption and chip area at the target resolution and sampling rate. Area-efficient gate-bootstrapped sampling switches of the input SHA maintain high signal linearity over the Nyquist rate even at a 1.0 V supply. The cascode compensation using a low-impedance feedback path in two-stage amplifiers of the SHA and MDACs achieves the required conversion speed and phase margin with less power consumption and area compared to the Miller compensation. A low-glitch dynamic latch in the sub-ranging flash ADCs reduces kickback noise referred to the input of comparator by isolating the pre-amplifier from the regeneration latch output. The proposed on-chip current and voltage references are based on triple negative TC circuits. The prototype ADC in a 0.13 µm 1P8M CMOS technology demonstrates the measured DNL and INL within 0.38LSB and 0.96LSB at 12-bit, respectively. The ADC shows a maximum SNDR and SFDR of 64.5 dB and 78.0 dB at 100 MS/s, respectively. The ADC with an active die area of 1.22 mm2 consumes 42.0 mW at 100 MS/s and a 1.2 V supply, corresponding to a figure-of-merit of 0.31 pJ/conversion-step.
Architectural Exploration and Design of Time-Interleaved SAR Arrays for Low-Power and High Speed A/D Converters
Sergio SAPONARA Pierluigi NUZZO Claudio NANI Geert VAN DER PLAS Luca FANUCCI

PAPER

Vol:
E92-C No:6
Page(s):
843-851
Time-interleaved (TI) analog-to-digital converters (ADCs) are frequently advocated as a power-efficient solution to realize the high sampling rates required in single-chip transceivers for the emerging communication schemes: ultra-wideband, fast serial links, cognitive-radio and software-defined radio. However, the combined effects of multiple distortion sources due to channel mismatches (bandwidth, offset, gain and timing) severely affect system performance and power consumption of a TI ADC and need to be accounted for since the earlier design phases. In this paper, system-level design of TI ADCs is addressed through a platform-based methodology, enabling effective investigation of different speed/resolution scenarios as well as the impact of parallelism on accuracy, yield, sampling-rate, area and power consumption. Design space exploration of a TI successive approximation ADC is performed top-down via Monte Carlo simulations, by exploiting behavioral models built bottom-up after characterizing feasible implementations of the main building blocks in a 90-nm 1-V CMOS process. As a result, two implementations of the TI ADC are proposed that are capable to provide an outstanding figure-of-merit below 0.15 pJ/conversion-step.
Design of 1 V Operating Fully Differential OTA Using NMOS Inverters in 0.18 µm CMOS Technology
Atsushi TANAKA Hiroshi TANIMOTO

PAPER

Vol:
E92-C No:6
Page(s):
822-827
This paper presents a 1 V operating fully differential OTA using NMOS inverters in place of the traditional differential pair. To obtain high gain, a two-stage configuration is used in which the first stage has feedforward paths to cancel the common-mode signal, and the second stage has common-mode feedback paths to stabilize the output common-mode voltage. The proposed OTA was fabricated by an 0.18 µm CMOS technology. Measured gain is 40 dB and GBW is 10 MHz, in addition to differential output voltage swing of 1.8 Vp - p. It is confirmed that the proposed OTA can operate from 1 V power supply and has very large output swing capability even in a 1 V operation. The proposed OTA configuration contributes to a solution to the low power supply voltage issue in scaled CMOS analog circuits.
A Low Power Reconfigurable Channel Filter Using Multi-Band and Masking Architecture for Channel Adaptation in Cognitive Radio
K. G. SMITHA A. P. VINOD

PAPER-Digital Signal Processing

Vol:
E92-A No:6
Page(s):
1424-1432
Cognitive radio (CR) is an adaptive spectrum sharing paradigm targeted to provide opportunistic spectrum access to secondary users for whom the frequency bands have not been licensed. The key tasks in a CR are to sense the spectral environment over a wide frequency band and allow unlicensed secondary users (CR users) to dynamically transmit/receive data over frequency bands unutilized by licensed primary users. Thus the CR transceiver should dynamically adapt its channel (frequency band) in response to the time-varying frequencies of wideband signal for seamless communication. In this paper, we present a low complexity reconfigurable filter architecture based on multi-band filtering and frequency masking techniques for dynamic channel adaptation in CR terminal. The proposed multi-standard architecture is capable of adapting to channels having different bandwidths corresponding to the channel spacing of time-varying channels. Design examples show that proposed architecture offers 12.2% power reduction and 26.5% average gate count reduction over conventional Per-Channel based architecture.
Design of Low Power QPP Interleave Address Generator Using the Periodicity of QPP
Won-Ho LEE Chong Suck RIM

LETTER-VLSI Design Technology and CAD

Vol:
E92-A No:6
Page(s):
1538-1540
This paper presents two power-saving designs for Quadratic Polynomial Permutation (QPP) interleave address generator of which interleave length K is fixed and unfixed, respectively. These designs are based on our observation that the quadratic term f2x2%K of f(x) = (f1x+f2x2)%K, which is the QPP address generating function, has a short period and is symmetric within the period. Power consumption is reduced by 27.4% in the design with fixed-K and 5.4% in the design with unfixed-K on the average for various values of K, when compared with existing designs.
Reducing On-Chip DRAM Energy via Data Transfer Size Optimization
Takatsugu ONO Koji INOUE Kazuaki MURAKAMI Kenji YOSHIDA

PAPER

Vol:
E92-C No:4
Page(s):
433-443
This paper proposes a software-controllable variable line-size (SC-VLS) cache architecture for low power embedded systems. High bandwidth between logic and a DRAM is realized by means of advanced integrated technology. System-in-Silicon is one of the architectural frameworks to realize the high bandwidth. An ASIC and a specific SRAM are mounted onto a silicon interposer. Each chip is connected to the silicon interposer by eutectic solder bumps. In the framework, it is important to reduce the DRAM energy consumption. The specific DRAM needs a small cache memory to improve the performance. We exploit the cache to reduce the DRAM energy consumption. During application program executions, an adequate cache line size which produces the lowest cache miss ratio is varied because the amount of spatial locality of memory references changes. If we employ a large cache line size, we can expect the effect of prefetching. However, the DRAM energy consumption is larger than a small line size because of the huge number of banks are accessed. The SC-VLS cache is able to change a line size to an adequate one at runtime with a small area and power overheads. We analyze the adequate line size and insert line size change instructions at the beginning of each function of a target program before executing the program. In our evaluation, it is observed that the SC-VLS cache reduces the DRAM energy consumption up to 88%, compared to a conventional cache with fixed 256 B lines.
A Way Enabling Mechanism Based on the Branch Prediction Information for Low Power Instruction Cache
Gi-Ho PARK Jung-Wook PARK Hoi-Jin LEE Gunok JUNG Sung-Bae PARK Shin-Dug KIM

LETTER

Vol:
E92-C No:4
Page(s):
517-521
This paper presents a cache way enabling mechanism using branch target addresses. This mechanism uses branch prediction information to avoid the power consumption due to unnecessary cache way access by enabling only the cache way(s) that should be accessed. The proposed cache way enabling mechanism reduces the power consumption of the instruction cache by 63% without any performance degradation of the processor. An ARM1136 processor simulator and the Synopsys PrimeTime are used to perform the performance/power simulation and static timing analysis of the proposed mechanisms respectively.
A Reference Voltage Buffer with Settling Boost Technique for a 12 bit 18 MHz Multibit/Stage Pipelined A/D Converter
Shunsuke OKURA Tetsuro OKURA Toru IDO Kenji TANIGUCHI

PAPER

Vol:
E92-A No:2
Page(s):
367-373
A reference voltage buffer for a multibit/stage pipelined ADC is described, where a settling boost technique is used to improve the settling response of the pipelined stages. A 12 bit 18 MHz pipelined ADC with the buffer is designed and simulated based on a 0.35 µm CMOS process. According to simulation results, the power consumed by the reference voltage buffer is reduced by 33% compared to that without the settling boost technique.
Evaluation of Interconnect-Complexity-Aware Low-Power VLSI Design Using Multiple Supply and Threshold Voltages
Hasitha Muthumala WAIDYASOORIYA Masanori HARIYAMA Michitaka KAMEYAMA

PAPER-High-Level Synthesis and System-Level Design

Vol:
E91-A No:12
Page(s):
3596-3606
This paper presents a high-level synthesis approach to minimize the total power consumption in behavioral synthesis under time and area constraints. The proposed method has two stages, functional unit (FU) energy optimization and interconnect energy optimization. In the first stage, active and inactive energies of the FUs are optimized using a multiple supply and threshold voltage scheme. Genetic algorithm (GA) based simultaneous assignment of supply and threshold voltages and module selection is proposed. The proposed GA based searching method can be used in large size problems to find a near-optimal solution in a reasonable time. In the second stage, interconnects are simplified by increasing their sharing. This is done by exploiting similar data transfer patterns among FUs. The proposed method is evaluated for several benchmarks under 90 nm CMOS technology. The experimental results show that more than 40% of energy savings can be achieved by our proposed method.
Cross-Layer Design for Low-Power Wireless Sensor Node Using Wave Clock
Takashi TAKEUCHI Yu OTAKE Masumi ICHIEN Akihiro GION Hiroshi KAWAGUCHI Chikara OHTA Masahiko YOSHIMOTO

PAPER

Vol:
E91-B No:11
Page(s):
3480-3488
We propose Isochronous-MAC (I-MAC) using the Long-Wave Standard Time Code (so called "wave clock"), and introduce cross-layer design for a low-power wireless sensor node with I-MAC. I-MAC has a periodic wakeup time synchronized with the actual time, and thus we take the wave clock. However, a frequency of a crystal oscillator varies along with temperature, which incurs a time difference among nodes. We present a time correction algorithm to address this problem, and shorten the time difference. Thereby, the preamble length in I-MAC can be minimized, which saves communication power. For further power reduction, a low-power crystal oscillator is also proposed, as a physical-layer design. We implemented I-MAC on an off-the-shelf sensor node to estimate the power saving, and verified that the proposed cross-layer design reduces 81% of the total power, compared to Low Power Listening.
Low Power Realization and Synthesis of Higher-Order FIR Filters Using an Improved Common Subexpression Elimination Method
K.G. SMITHA A.P. VINOD

PAPER-Digital Signal Processing

Vol:
E91-A No:11
Page(s):
3282-3292
The complexity of Finite Impulse Response (FIR) filters is mainly dominated by the number of adders (subtractors) used to implement the coefficient multipliers. It is well known that Common Subexpression Elimination (CSE) method based on Canonic Signed Digit (CSD) representation considerably reduces the number of adders in coefficient multipliers. Recently, a binary-based CSE (BSE) technique was proposed, which produced better reduction of adders compared to the CSD-based CSE. In this paper, we propose a new 4-bit binary representation-based CSE (BCSE-4) method which employs 4-bit Common Subexpressions (CSs) for implementing higher order low-power FIR filters. The proposed BCSE-4 offers better reduction of adders by eliminating the redundant 4-bit CSs that exist in the binary representation of filter coefficients. The reduction of adders is achieved with a small increase in critical path length of filter coefficient multipliers. Design examples show that our BCSE-4 gives an average power consumption reduction of 5.2% and 6.1% over the best known CSE method (BSE, NR-SCSE) respectively, when synthesized with TSMC-0.18 µm technology. We show that our BCSE-4 offers an overall adder reduction of 6.5% compared to BSE without any increase in critical path length of filter coefficient multipliers.
Power and Skew Aware Point Diffusion Clock Network
Gunok JUNG Chunghee KIM Kyoungkuk CHAE Giho PARK Sung Bae PARK

LETTER-Integrated Electronics

Vol:
E91-C No:11
Page(s):
1832-1834
This letter presents point diffusion clock network (PDCN) with local clock tree synthesis (CTS) scheme. The clock network is implemented with ten times wider metal line space than typical mesh networks for low power and utilized to nine times smaller area CTS execution for minimized clock skew amount. The measurement results show that skew amount of PDCN with local CTS is reduced to 36% and latency is shrunk to 45% of the amount in a 4.81 mm2 CortexA-8 core with 65 nm Samsung process.
4.8 GHz CMOS Frequency Multiplier Using Subharmonic Pulse-Injection Locking for Spurious Suppression
Kyoya TAKANO Mizuki MOTOYOSHI Minoru FUJISHIMA

PAPER

Vol:
E91-C No:11
Page(s):
1738-1743
To realize low-power wireless transceivers, it is necessary to improve the performance of frequency synthesizers, which are typically frequency multipliers composed of a phase-locked loop (PLL). However, PLLs generally consume a large amount of power and occupy a large area. To improve the frequency multiplier, we propose a pulse-injection-locked frequency multiplier (PILFM), where a spurious signal is suppressed using a pulse input signal. An injection-locked oscillator (ILO) in a PILFM was fabricated by a 0.18 µm 1P5M CMOS process. The core size is 10.8 µm10.5 µm. The power consumption of the ILO is 9.6 µW at 250 MHz, 255 µW at 2.4 GHz and 1.47 mW at 4.8 GHz. The phase noise is -105 dBc/Hz at a 1 MHz offset.

121-140hit(376hit)

Keyword Search Result

[Keyword] low power(376hit)

Low-Power Embedded Processor Design Using Branch Direction

An Approach for Reducing Leakage Current Variation due to Manufacturing Variability

Design of Voltage-Mode MAX-MIN Circuits with Low Area and Low Power Consumption

Ultra Low Power Delay Element with Post-Chip Adjustable Ability

Energy-Efficient Pre-Execution Techniques in Two-Step Physical Register Deallocation

Power Minimization for Dual- and Triple-Supply Digital Circuits via Integer Linear Programming

A Low-Power High Accuracy Over Current Protection Circuit for Low Dropout Regulator

A 0.31 pJ/Conversion-Step 12-Bit 100 MS/s 0.13 µm CMOS A/D Converter for 3G Communication Systems

Architectural Exploration and Design of Time-Interleaved SAR Arrays for Low-Power and High Speed A/D Converters

Design of 1 V Operating Fully Differential OTA Using NMOS Inverters in 0.18 µm CMOS Technology

A Low Power Reconfigurable Channel Filter Using Multi-Band and Masking Architecture for Channel Adaptation in Cognitive Radio

Design of Low Power QPP Interleave Address Generator Using the Periodicity of QPP

Reducing On-Chip DRAM Energy via Data Transfer Size Optimization

A Way Enabling Mechanism Based on the Branch Prediction Information for Low Power Instruction Cache

A Reference Voltage Buffer with Settling Boost Technique for a 12 bit 18 MHz Multibit/Stage Pipelined A/D Converter

Evaluation of Interconnect-Complexity-Aware Low-Power VLSI Design Using Multiple Supply and Threshold Voltages

Cross-Layer Design for Low-Power Wireless Sensor Node Using Wave Clock

Low Power Realization and Synthesis of Higher-Order FIR Filters Using an Improved Common Subexpression Elimination Method

Power and Skew Aware Point Diffusion Clock Network

4.8 GHz CMOS Frequency Multiplier Using Subharmonic Pulse-Injection Locking for Spurious Suppression

Latest Issue

Links

Call for Papers

Submit to IEICE Trans.

Transactions NEWS

Popular articles